Studying and Using Data Structures ================================== Two points of view: (1) Usage Abstract interface provided by each data structure a. Called an "Abstract Data Type" b. Usually represented as a Java interface (2) Implementation Various techniques for implementing the data structure The next several lectures focus on using various data structures and their Java library implementations to solve representative problems. You will make use of these data structures, also known as collections, in the next few projects. Collections/Data Structures --------------------------- What's a Collection? A group of objects (together with operations) Example: lists, stacks, queues, sets, etc. Included in the Java library: Collection, List, Set, SortedSet, ArrayList, LinkedList, TreeSet, HashSet How are collections the same? Collection defines common behaviors of the data structures Collection defines a set of methods that should be provided by any class that implements a data structure How are collections/data structures different? (1) order (List, Stack, Queue) no order (Set) (2) duplicates (List, Stack, Queue) no duplicats (Set) (3) access (Stack = LIFO, Queue = FIFO, List = position) What does the Collection interface allow you to do? Access different data structures using the same interface public interface Collection { int size(); void clear(); boolean isEmpty(); boolean add(Object x); boolean remove(Object x); boolean contains(Object x); Object[] toArray(); Iterator iterator(); } Java library classes implementing Collection: ArrayList, LinkedList, TreeSet, HashSet (a) Demo1.java make ArrayList of Strings (names) make HashSet of Integers demo add and remove Iterators --------- An iterator is an object that allows access to each object in a collection Simplest form of iterator: Loop and array index approach for (int i = 0; i < a.length; i++) System.out.println (a[i]); What is the limitation of this approach? Must know the internal structure of the collection In general, we don't know (or want to abstract) the internal structure of a collection We create an Iterator interface public interface Iterator { boolean hasNext(); Object next(); void remove(); } How do we then create an Iterator? Require the container object to implement an iterator() method that returns an iterator for that container public class MyContainer { Object [] items; int size; public Iterator iterator() { return new MyContainerIterator(this); } // other methods as needed } public interface Iterator { boolean hasNext(); Object next(); // other generic methods as desired } public class MyContainerIterator implements Iterator { private int current = 0; private MyContainer = container; MyContainerIterator(MyContainer c) { container = c; } public boolean hasNext() { return current < container.size; } public Object next() { return container[current++]; } } public static void main(String args[]) { MyContainer bag = new MyContainer(); // add elements to bag here as needed Iterator itr = bag.iterator(); while (itr.hasNext()) { // some operations that applies to each element // e.g., System.out.println(itr.next()); } } (b) Demo2.java write method to find maximum value in a Collection show that it works with different types of Collections Remarks: 1. (Comparable)x is a type cast. It changes the static type of x to Comparable. The class Comparable is an interface, with a single method compareTo. (see sec. 4.4.1, 4.4.2 & 6.4.1 for details, or section 6.4.2 for an alternative) 2. The term "factory method" is sometimes used for methods like iterator() which return objects whose type is "unknown", i.e., it is the type of the interface, but is instantiated to a concrete type in the body of the method. Lists ----- A list is a collection where the elements have both order and position. Does List have any methods in addition to those in Collection? Items on a List can be accessed based on their position (Note, Collection does not support access by position, because it wants to be generic enough to allow collections with ordered and unordered elements, position-specific and position-independent elements) public interface List extends Collection { void add(int index, Object x); Object remove(int index); int indexOf(Object x); Object get(int index); Object set(int index, Object element); ListIterator listIterator(int pos); } Class Exercise Write a reverse method for List static void reverse(List list) { // code goes here // use the methods of List } (a) Demo3.java implement reverse method for List make ArrayList of Strings (names) make LinkedList of Integers show it works with both What's a ListIterator? An iterator over a List that gives the elements in order (regular Iterators do not guarantee any particular order) A ListIterator can iterate both forward and backward Does ListIterator have any methods in addition to those in Iterator? public interface ListIterator extends Iterator { boolean hasPrevious(); Object previous(); } Why are there two implementations of List (ArrayList, LinkedList)? ArrayList is good for get/set LinkedList is good for insert/remove in middle of list (b) Timedemo-List.java show time differences for ArrayList and LinkedList timedemo get on LinkedList timedemo insert on ArrayList Sets ---- A set is a collection of elements with no duplicates. It typically is unordered, but one may impose an order on the elements if desired. Does Set have any methods in addition to those in Collection? No new methods are needed public interface Set extends Collection { } Why make a new interface with no methods? Why not just implement the Collection interface to make a Set? Classes that implement Set will not allow duplicates Does SortedSet have any methods in addition to those in Set? First and last items can be defined public interface SortedSet extends Set { Object first(); Object last(); } HashSet implements Set TreeSet implements SortedSet (a) Demo4.java show both an unordered and an ordered set What can you say about the order of an Iterator on a HashSet? Elements are processed in no particular order What can you say about the order of an Iterator on a TreeSet? Elements are processed in compareTo order Both HashSet and TreeSet implement Set Are there any interface-visible differences besides order? Theoretically HashSet is constant time Theoretically TreeSet is logN time (b) Timedemo-Set.java show time differences for HashSet and TreeSet NOTE: our theory does not always hold in practice (here, HashSet has worse performance that TreeSet; some of the assumptions we make for the bigOh analysis do not hold) Important Implementation Remarks -------------------------------- Note that (Collection) classes in the standard java library already implement the methods discussed below. However, when you write your own (Collection) classes, you must implement these methods. toString method Implement: Always Reason: Allows readable display of objects' contents equals method Implement: essentially always Collection, List, Set, SortedSet, ArrayList, LinkedList, TreeSet, HashSet Reason: Contains and remove methods need equals method to work Comparable interface Implement: SortedSet, TreeSet Reason: The implementation keeps the elements sorted hashCode method Implement: HashSet Reason: HashSet uses hashCode to organize the placement of its elements so they can be added, removed, and found quickly Rules: If two objects are ".equals", hashCode must return the same value If two objects are not ".equals", hashCode may return different values. If hashCode returns the same value, HashSet will be less efficient Stacks ------ A stack is a List restricted to adding/removing only at the top Only the top item on the stack is accessible LIFO (last in, first out) access Items come out in reverse order Examples of usage Stack of plates Method calls Reversing lists - Think back to the program above Balancing parentheses Generally, useful when you need to process data in reverse order. What are valid operations on a Stack? push, pop, top all are constant time O(1) public interface Stack { void push(Object x); Object pop(); Object peek(); boolean isEmpty(); } How does Stack fit into the Java Collections hierarchy? java.util.Stack extends Vector fine, but rather slow implement as a LinkedList use only addFirst, removeFirst and getFirst Balanced Symbol Checker Missing brackets ( ), ], } ) often cause compilers to print many unrelated error messages A balanced symbol checker finds these problems Assume the program ignores all symbols except brackets Balanced Symbol Checker Algorithm (1) create an empty stack (2) for each opening symbol, push it on the stack (3) for each closing symbol, compare it with the top of stack (a) if the stack is empty, report error (b) else pop the stack (i) if the symbols don't match, report error (4) if stack is not empty at the end of file, report error (a) Balanced.java Simple Calculator Infix expression (e.g., 1 + 2 - 3 * 4 / 5) Algorithm for evaluating infix expressions (1) precedence (* and / have higher precedence than + and -) (2) associativity (all of these operators associate left-to-right) Can you write the expression so that precedence and associativity rules are not needed? (1) always fully parenthesize expressions (worse than rules?) (2) use Postfix notation Convert the expression to postfix. 1 + 2 - 3 * 4 / 5 --> (1 + 2) - ((3 * 4) / 5 ) --> 1 2 + 3 4 * 5 / - Postfix Expression Evaluation Algorithm (1) create an empty stack (2) Scan the expression left-to-right (3) If the symbol is an operand, push it on the stack (4) If the symbol is an operator (a) pop the top two values off the stack (b) perform the operator (first popped is right operand, second is left) (c) push the result on the stack (d) if the stack has less than two values, report error (5) After reading the expression, the answer is on the stack (a) if the stack has more or less than one value, error Evaluate the postfix expression [1 2 + 3 4 * 5 / -] using a Stack (b) Postfix.java evaluate postfix expression How can you use the postfix evaluator to evaluate infix expressions? (1) convert infix expression to postfix (2) use postfix evaluator Algorithm for converting Infix to Postfix? (1) Create a stack for storing operators (2) Initialize the Postfix expression to the empty string (3) Scan the Infix expression from left-to-right (4) If the symbol is an operand append it to the Postfix expression (5) If the symbol is an operator a. while the operator on the stack has precedence >= pop the operator and append it to the Postfix expression b. push the new operator onto the stack (6) If the symbol is a left paren push the left paren on the stack (7) If the symbol is a right paren pop the stack and append to output until left paren (8) If at the end of the Infix expression pop operators left on the stack and append to output Convert the infix expression [1 + 2 - 3 * 4 / 5] to postfix using the algorithm (b) Postfix.java infix to postfix translation Queues ------ A queue is a List restricted to removing only at the front and adding only at the back FIFO (first in, first out) access Items come out in the order they came in How is a Queue similar to, but also different from, a Stack? Both Queues and Stacks are restricted Lists Stack restricts both add and remove only at top Examples of usage waiting in line print queue for shared printer request queue for web server Generally, useful when fair service is required. What are valid operations on a Queue? enqueue, dequeue all are constant time O(1) public interface Queue { void enqueue (Object x); Object dequeue (); Object getFront(); boolean isEmpty(); void clear(); int size(); } There is no Queue interface or Queue class in the java library. Can you implement a Queue using some other class? LinkedList enqueue() is addLast() dequeue() is removeFirst() getFront() is getFirst() (a) FIFO-ER1.java (Patient class) FIFO-ER2.java (Emergency Room) Priority Queues --------------- A Priority Queue is a Queue where access is based on priority Each item in the queue is assigned a priority Item with highest priority is next out Examples of usage Simulation Generally useful when fairness based on arrival time is not most efficient (e.g., size of a request may affect its service time) What are valid operations on a Priority Queue? insert, findMin, deleteMin public interface PriorityQueue { void insert (Object x); Object findMin(); Object deleteMin(); boolean isEmpty(); void clear(); int size(); } There is no Priority Queue interface or class in the java library. Can you implement a Priority Queue using some other class? SortedSet (almost works) only works with no duplicate priorities insert() is add() findMin() is first() deleteMin() is first() and remove() (a) Priority-ER1.java (revised Patient class) Priority-ER2.java (Priority Emergency Room) Maps ---- A map is a set of pairs Keys must be unique Values need not be unique Examples of usage Phone book (name -> number) Dictionary (word -> definition) Should Map extend Collection? Not really A map is a collection of pairs, BUT We don't want to work directly with the pairs We want to work with keys and values independently What are valid operations on a Map? put, get, remove public interface Map { int size(); void clear(); boolean isEmpty(); Object put(Object key, Object value); Object get(Object key); Object remove(Object key); boolean containsKey(Object key); Set keySet(); // the set of keys Collection values(); // the collection of values Set entrySet(); // the set of pairs (type Map.Entry) public interface Entry { // the type definition for pairs Object getKey(); Object getValue(); } } Why does Map not provide an iterator factory method? We don't wan't to iterate over the pairs We want to iterate over keys (a) Demo5.java (iterating over a Map) (show both iterators) What if we wanted to iterate over the pairs? Use entrySet().iterator() The next() method of this iterator returns an object of type Map.Entry One potential use of Maps is to record item frequencies (b) Demo6a.java (reading words into a set) Demo6b.java (word frequency counter) Sorted Maps ----------- A Sorted Map is a Map whose elements are sorted by key both keySet and entrySet return Sorted Sets Does SortedMap have any methods in addition to those in Map? order-specific methods (only for keys) public interface SortedMap extends Map { Object firstKey(); Object lastKey(); } SortedMap extends Map. There are two implementations: HashMap implements Map (i.e., no order implied) and TreeMap implements SortedMap (i.e., order is enforced) Using SortedMap to generate cross-references What's a Cross-Reference Generator? Read source file Find identifiers Output identifiers in sorted order For each identifier X, list line numbers where X occurs What data types could be used in a Cross-Reference Generator? Map to store (identifier/list of line numbers) pairs List to store list of line numbers What's the algorithm used in a Cross-Reference Generator? loop reading identifiers if identifier not in Map make new List add (identifier/List) to Map add line number to List iterate over Map output identifier iterate over List output line number (c) Xref.java (use with preamble.txt)